Big Data for Managers by Atal Malviya and Mike Malmgren

Author: Malviya, Atal; Malmgren, Mike; [Unknown]
Language: eng
Format: epub
Publisher: Taylor & Francis (CAM)
Published: 2018-11-26T07:30:00+00:00


In many cases, entity extraction from raw text can be automated as entity recognition: the software parses the text, selects the entities and classifies them without manual tagging. This is a very common requirement for businesses that take unstructured data seriously and have started to collect and work on raw data such as social media feeds or machine logs. Among the uses in the previous list, recognizing dates, times, monetary amounts and so on is something text analytics software can often do out of the box, without you having to define what these extractions are. The benefits of such an approach are clear. With the help of new Big Data storage and analytics tools, the time required to perform entity extraction has been drastically reduced, and the work can be scaled across large volumes of text very quickly. The outcome of entity extraction is a set of structured data that can be merged with enterprise data for further analysis.
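To make the idea of turning raw text into structured data concrete, here is a minimal sketch in Python. It assumes three simplified patterns (ISO-style dates, 24-hour times and currency amounts); the pattern set, the `extract_entities` name and the sample log line are illustrative assumptions, not the behaviour of any particular text analytics product, which would handle far more formats.

```python
import re

# Illustrative patterns only (assumptions); real engines cover many more formats.
PATTERNS = {
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "time": re.compile(r"\b\d{1,2}:\d{2}\b"),
    "money": re.compile(r"[$€£]\d+(?:,\d{3})*(?:\.\d{2})?"),
}


def extract_entities(text):
    """Return one structured record per entity found in the raw text."""
    records = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            records.append(
                {"type": label, "value": match.group(), "start": match.start()}
            )
    # Order records by where they appear in the text.
    return sorted(records, key=lambda record: record["start"])


# Hypothetical machine-log line used only for demonstration.
log_line = "2018-11-26 07:30 refund of $1,250.00 issued to customer 4411"
for record in extract_entities(log_line):
    print(record)
```

Each record is a plain dictionary, so the output could be loaded into a table and joined with enterprise data on whatever key the business chooses.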

If known patterns such as email addresses or phone numbers appear in the text documents, text analytics engines can easily identify them and classify the content of the documents accordingly. Text analysis can also measure how often words occur (word density, often represented as a word cloud) and identify how words relate to and connect with other words in the document. Business-specific patterns and entities can also be defined, so that the engine classifies extracted words and sentences into those defined categories.
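The three ideas in this paragraph (known patterns, word density and business-specific categories) can be sketched together in a few lines of Python. The email and phone patterns are deliberately simplified, and the `BUSINESS_TERMS` categories, the stopword list and the `analyze` function are hypothetical names chosen for this example only.

```python
import re
from collections import Counter

# Simplified patterns (assumptions): one email shape, one phone format.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")

# Hypothetical business-specific categories and their trigger terms.
BUSINESS_TERMS = {
    "billing": {"invoice", "refund", "charge"},
    "delivery": {"shipping", "courier", "delayed"},
}

STOPWORDS = {"the", "a", "an", "is", "was", "and", "to", "my", "me",
             "at", "or", "for", "on", "of"}


def analyze(text):
    emails = EMAIL.findall(text)
    phones = PHONE.findall(text)
    # Remove matched patterns so their fragments do not pollute word counts.
    remainder = PHONE.sub(" ", EMAIL.sub(" ", text))
    words = [w for w in re.findall(r"[a-z]+", remainder.lower())
             if w not in STOPWORDS]
    # Word frequencies are the raw numbers behind a word cloud.
    frequencies = Counter(words)
    # Keep only the business terms that actually occur in the text.
    categories = {name: sorted(terms & set(words))
                  for name, terms in BUSINESS_TERMS.items()}
    return {"emails": emails, "phones": phones,
            "frequencies": frequencies, "categories": categories}


result = analyze(
    "The refund for my invoice was delayed. Email me at "
    "jane.doe@example.com or call 555-123-4567 about the refund."
)
print(result["emails"], result["phones"])
print(result["frequencies"].most_common(3))
print(result["categories"])
```

A production engine would replace the hand-written term sets with maintained dictionaries or learned models, but the flow of detect, count and categorize stays the same.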

A typical social media or web-based text analytics engine reads and curates social and web conversational data in the form of text, images, video and so on, using techniques such as web scraping or predefined social media data APIs such as GNIP. The engine then builds a taxonomy-based classification from initial learning data and from rules defined for the chosen industry or vertical. Classification algorithms first identify and classify text into categories using training sets of data, and then progressively refine these classifications as new data arrives and larger volumes of data are processed.
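The train-then-refine cycle described above can be illustrated with a very small classifier. The sketch below uses multinomial naive Bayes with Laplace smoothing, which is one common choice and not necessarily what any given engine uses; the class name, the two categories and the training sentences are all assumptions made for the example.

```python
import math
import re
from collections import Counter, defaultdict


class IncrementalTextClassifier:
    """Tiny multinomial naive Bayes that can keep learning from new examples."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> documents seen
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z]+", text.lower())

    def learn(self, text, category):
        words = self.tokenize(text)
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        words = self.tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, -math.inf
        for category, docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(docs / total_docs)  # category prior
            for word in words:
                # Laplace smoothing keeps unseen words from zeroing the score.
                score += math.log(
                    (counts[word] + 1) / (total_words + len(self.vocabulary))
                )
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Hypothetical initial training set.
classifier = IncrementalTextClassifier()
for text, category in [
    ("wrong charge on my bill", "billing"),
    ("refund my invoice charge please", "billing"),
    ("my parcel arrived very late", "delivery"),
    ("courier lost my parcel today", "delivery"),
]:
    classifier.learn(text, category)

before = classifier.classify("card payment failed today")
# Refine the model with one newly labelled example.
classifier.learn("card payment failed at checkout", "billing")
after = classifier.classify("card payment failed today")
print(before, "->", after)  # delivery -> billing
```

With only the initial data, the word "today" pulls the message toward delivery; once a labelled billing example containing the payment vocabulary is added, the same message is classified as billing.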

It is important to understand that semantic analysis is not just about identifying and classifying words; context comes before content in the analysis. Context is identified using multiple taxonomies and connected ontologies. As a result, the same keyword appearing in two different sentences can be classified by the system with totally different intentions or sentiments, for example:

• This TV is large => positive sentiment, as people like a large TV

• This mobile is large => negative sentiment, as people don’t like large mobiles
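The TV and mobile example can be sketched as a lookup that resolves the product context first and only then scores the attribute word. The tiny `TAXONOMY` and `CONTEXT_SENTIMENT` tables below are hypothetical stand-ins for the much larger taxonomies and ontologies a real engine would use.

```python
import re

# Hypothetical taxonomy: surface terms mapped to a product category.
TAXONOMY = {
    "tv": "television",
    "television": "television",
    "mobile": "mobile_phone",
    "phone": "mobile_phone",
}

# Hypothetical context rules: the same word scores differently per category.
CONTEXT_SENTIMENT = {
    ("television", "large"): "positive",
    ("mobile_phone", "large"): "negative",
}


def classify_sentiment(sentence):
    words = re.findall(r"[a-z]+", sentence.lower())
    # Step 1: establish context from the first known product term.
    category = next((TAXONOMY[w] for w in words if w in TAXONOMY), None)
    # Step 2: score attribute words in the light of that context.
    for word in words:
        sentiment = CONTEXT_SENTIMENT.get((category, word))
        if sentiment:
            return category, word, sentiment
    return category, None, "neutral"


print(classify_sentiment("This TV is large"))      # ('television', 'large', 'positive')
print(classify_sentiment("This mobile is large"))  # ('mobile_phone', 'large', 'negative')
```

Keying the rule on the pair of category and word, and not on the word alone, is what lets "large" carry opposite sentiment in the two sentences.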